Multi-document summarization based on the Yago ontology

Authors

  • Elena Baralis
  • Luca Cagliero
  • Saima Jabeen
  • Alessandro Fiori
  • Sajid Shah
Abstract

Sentence-based multi-document summarization is the task of generating a succinct summary of a document collection, which consists of the most salient document sentences. In recent years, the increasing availability of semantics-based models (e.g., ontologies and taxonomies) has prompted researchers to investigate their usefulness for improving summarizer performance. However, semantics-based document analysis is often applied as a preprocessing step, rather than integrating the discovered knowledge into the summarization process. This paper proposes a novel summarizer, namely Yago-based Summarizer, that relies on an ontology-based evaluation and selection of the document sentences. To capture the actual meaning and context of the document sentences and generate sound document summaries, an established entity recognition and disambiguation step based on the Yago ontology is integrated into the summarization process. The experimental results, which were achieved on the DUC’04 benchmark collections, demonstrate the effectiveness of the proposed approach compared to a large number of competitors as well as the qualitative soundness of the generated summaries.
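The abstract does not give the paper's scoring formula, so the following is only a minimal sketch of the general idea it describes: recognize entities in each sentence, score sentences by the entities they mention, and select the top-scoring sentences as the summary. Everything named here is an assumption made for illustration: the toy lexicon `TOY_ENTITY_LEXICON` stands in for a real Yago-based entity recognition and disambiguation step, and the frequency-based `score` is a placeholder, not the authors' method.

```python
import re
from collections import Counter

# Hypothetical stand-in for entity recognition and disambiguation.
# A real system would query an ontology such as Yago; this is a toy lookup.
TOY_ENTITY_LEXICON = {
    "rome": "Rome",
    "italy": "Italy",
    "tiber": "Tiber",
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def recognize_entities(sentence):
    """Return the entity identifiers found in a sentence (toy lexicon lookup)."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [TOY_ENTITY_LEXICON[t] for t in tokens if t in TOY_ENTITY_LEXICON]


def summarize(documents, max_sentences=2):
    """Pick the sentences whose entities are most frequent across the collection."""
    sentences = [s for doc in documents for s in split_sentences(doc)]

    # Assumed salience measure: how often each entity occurs in the whole collection.
    entity_counts = Counter(e for s in sentences for e in recognize_entities(s))

    def score(sentence):
        # Sum the collection-level counts of the distinct entities in the sentence.
        return sum(entity_counts[e] for e in set(recognize_entities(sentence)))

    # Rank by descending score; ties keep the original order.
    ranked = sorted(range(len(sentences)), key=lambda i: (-score(sentences[i]), i))
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    docs = [
        "Rome is the capital of Italy. The weather was mild.",
        "The Tiber flows through Rome. Italy has many rivers.",
    ]
    for line in summarize(docs):
        print(line)
```

In this sketch the entity step takes part directly in sentence selection rather than acting as a separate preprocessing pass, which is the distinction the abstract draws. Swapping the toy lexicon for a real disambiguation service would change only `recognize_entities`.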

Similar resources

Multi-document summarization using closed patterns

There are two main categories of multi-document summarization: term-based and ontology-based methods. A term-based method cannot deal with the problems of polysemy and synonymy. An ontology-based approach addresses such problems by taking into account the semantic information of document content, but constructing an ontology requires a great deal of manual effort. To overcome these open problems, thi...

Ontology and Query-Focused Multi-Document Summarization System

Due to the rapid growth of online information on specific topics, Multiple Document Summarization (MDS) has become a non-trivial task. MDS helps the user understand a large volume of information in a short time by creating a concise and comprehensive summary. In addition, a user-query-based MDS system provides a consistent summary, including the core of the information. T...

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

A new text-mining method for extracting user context information to improve the ranking of search engine results

Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual and documentary material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the most similar ones to the user ...

Deriving Event Relevance from the Ontology Constructed with Formal Concept Analysis

In this paper, we present a novel approach to derive event relevance from an event ontology constructed with Formal Concept Analysis (FCA), a mathematical approach to data analysis and knowledge representation. The ontology is built from a set of relevant documents and according to the named entities associated with the events. Various relevance measures are explored, from binary to scaled, and from...

Journal:
  • Expert Syst. Appl.

Volume 40

Publication date: 2013